ARIES: A Corpus of Scientific Paper Edits Made in Response to Peer Reviews
Revising scientific papers based on peer feedback is a challenging task that
requires not only deep scientific knowledge and reasoning, but also the ability
to recognize the implicit requests in high-level feedback and to choose the
best of many possible ways to update the manuscript in response. We introduce
this task for large language models and release ARIES, a dataset of review
comments and their corresponding paper edits, to enable training and evaluating
models. We study two versions of the task: comment-edit alignment and edit
generation, and evaluate several baselines, including GPT-4. We find that
models struggle even to identify the edits that correspond to a comment,
especially in cases where the comment is phrased in an indirect way or where
the edit addresses the spirit of a comment but not the precise request. When
tasked with generating edits, GPT-4 often succeeds in addressing comments on a
surface level, but it rigidly follows the wording of the feedback rather than
the underlying intent, and includes fewer technical details than human-written
edits. We hope that our formalization, dataset, and analysis will form a
foundation for future work in this area.
Comment: 11 pages, 2 figures
Pulmonary delivery of vancomycin dry powder aerosol to intubated rabbits
TGX-221 is a potent, selective, and cell membrane permeable inhibitor of the PI3K p110β catalytic subunit. Recent studies showed that TGX-221 has anti-proliferative activity against PTEN-deficient tumor cell lines including prostate cancers. The objective of this study was to develop an encapsulation system for parenterally delivering TGX-221 to the target tissue through a prostate-specific membrane aptamer (PSMAa10) with little or no side effects. In this study, PEG-PCL micelles were formulated to encapsulate the drug, and a prodrug strategy was pursued to improve the stability of the carrier system. Fluorescence imaging studies demonstrated that the cellular uptake of both drug and nanoparticles was significantly improved by targeted micelles in a PSMA positive cell line. The area under the plasma concentration time curve of the micelle formulation in nude mice was 2.27-fold greater than that of the naked drug, and the drug clearance rate was 17.5-fold slower. These findings suggest a novel formulation approach for improving site-specific drug delivery of a molecular-targeted prostate cancer treatment
Concordant Gene Expression in Leukemia Cells and Normal Leukocytes Is Associated with Germline cis-SNPs
The degree to which gene expression covaries between different primary tissues within an individual is not well defined. We hypothesized that expression that is concordant across tissues is more likely influenced by genetic variability than gene expression which is discordant between tissues. We quantified expression of 11,873 genes in paired samples of primary leukemia cells and normal leukocytes from 92 patients with acute lymphoblastic leukemia (ALL). Genetic variation at >500,000 single nucleotide polymorphisms (SNPs) was also assessed. The expression of only 176/11,783 (1.5%) genes was correlated (p<0.008, FDR = 25%) in the two tissue types, but expression of a high proportion (20 of these 176 genes) was significantly related to cis-SNP genotypes (adjusted p<0.05). In an independent set of 134 patients with ALL, 14 of these 20 genes were validated as having expression related to cis-SNPs, as were 9 of 20 genes in a second validation set of HapMap cell lines. Genes whose expression was concordant among tissue types were more likely to be associated with germline cis-SNPs than genes with discordant expression in these tissues; genes affected were involved in housekeeping functions (GSTM2, GAPDH and NCOR1) and purine metabolism
The Semantic Reader Project: Augmenting Scholarly Documents through AI-Powered Interactive Reading Interfaces
Scholarly publications are key to the transfer of knowledge from scholars to
others. However, research papers are information-dense, and as the volume of
the scientific literature grows, the need for new technology to support the
reading process grows. In contrast to the process of finding papers, which has
been transformed by Internet technology, the experience of reading research
papers has changed little in decades. The PDF format for sharing research
papers is widely used due to its portability, but it has significant downsides
including: static content, poor accessibility for low-vision readers, and
difficulty reading on mobile devices. This paper explores the question "Can
recent advances in AI and HCI power intelligent, interactive, and accessible
reading interfaces -- even for legacy PDFs?" We describe the Semantic Reader
Project, a collaborative effort across multiple institutions to explore
automatic creation of dynamic reading interfaces for research papers. Through
this project, we've developed ten research prototype interfaces and conducted
usability studies with more than 300 participants and real-world users showing
improved reading experiences for scholars. We've also released a production
reading interface for research papers that will incorporate the best features
as they mature. We structure this paper around challenges scholars and the
public face when reading research papers -- Discovery, Efficiency,
Comprehension, Synthesis, and Accessibility -- and present an overview of our
progress and remaining open challenges
Finishing the euchromatic sequence of the human genome
The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead
N-Acylethanolamines in human reproductive fluids
N-Acylethanolamines (NAEs) are an important family of lipid-signaling molecules. Arachidonylethanolamide (anandamide) (AEA), palmitoylethanolamide (PEA), and oleoylethanolamide (OEA) are co-produced from similar phospholipid precursors when neurons are stimulated. AEA is an endogenous agonist (endocannabinoid) for cannabinoid receptors. It binds with higher affinity to type CB1 than to type CB2 cannabinoid receptors. PEA does not bind to CB1, while the hypothesis that it reacts with putative CB2-like receptors has been questioned. OEA does not activate currently known cannabinoid receptors, but it mimics the effects of AEA and cannabinoids in reducing the fertilizing capacity of sea urchin sperm. OEA and PEA also act as entourage compounds by inhibiting the hydrolysis of AEA by fatty acid amide hydrolase. Cannabinoid receptors and/or AEA are present in mammalian reproductive organs including the testis, epididymis, prostate, ovary, uterus, sperm, preimplantation embryo and placenta, as well as prostatic and mammary carcinomas. We now report that analysis by high-performance liquid chromatography/mass spectrometry (HPLC/MS) shows the presence of AEA, PEA, and OEA in human seminal plasma, mid-cycle oviductal fluid, follicular fluid, amniotic fluid, milk, and fluids from malignant ovarian cysts. Previous studies showed that AEA-signaling via cannabinoid receptors regulates capacitation and fertilizing potential of human sperm, early embryonic development and blastocyst implantation into the uterine mucosa of rodents, as well as proliferation of human mammary and prostatic carcinomas. Current results imply that NAEs also may modulate follicular maturation and ovulation, normal and pathological ovarian function, placental and fetal physiology, lactation, infant physiology, and behavior. 
Collectively, these findings suggest that NAEs in human reproductive fluids may help regulate multiple physiological and pathological processes in the reproductive system, and imply that exogenous cannabinoids delivered by marijuana smoke might impact these processes. This study has potential medical and public policy ramifications because of the incidence of marijuana abuse by adolescents and adults in our society, previously documented reproductive effects of marijuana, and the ongoing debate about medicinal use of marijuana and cannabinoids
ACCoRD: A Multi-Document Approach to Generating Diverse Descriptions of Scientific Concepts
Systems that can automatically define unfamiliar terms hold the promise of
improving the accessibility of scientific texts, especially for readers who may
lack prerequisite background knowledge. However, current systems assume a
single "best" description per concept, which fails to account for the many
potentially useful ways a concept can be described. We present ACCoRD, an
end-to-end system tackling the novel task of generating sets of descriptions of
scientific concepts. Our system takes advantage of the myriad ways a concept is
mentioned across the scientific literature to produce distinct, diverse
descriptions of target scientific concepts in terms of different reference
concepts. To support research on the task, we release an expert-annotated
resource, the ACCoRD corpus, which includes 1,275 labeled contexts and 1,787
hand-authored concept descriptions. We conduct a user study demonstrating that
(1) users prefer descriptions produced by our end-to-end system, and (2) users
prefer multiple descriptions to a single "best" description